Inferring fitness landscapes by regression produces biased estimates of epistasis.

Authors

  • Jakub Otwinowski
  • Joshua B Plotkin
Abstract

The genotype-fitness map plays a fundamental role in shaping the dynamics of evolution. However, it is difficult to directly measure a fitness landscape in practice, because the number of possible genotypes is astronomical. One approach is to sample as many genotypes as possible, measure their fitnesses, and fit a statistical model of the landscape that includes additive and pairwise interactive effects between loci. Here, we elucidate the pitfalls of using such regressions by studying artificial but mathematically convenient fitness landscapes. We identify two sources of bias inherent in these regression procedures, each of which tends to underestimate high fitnesses and overestimate low fitnesses. We characterize these biases for random sampling of genotypes as well as samples drawn from a population under selection in the Wright-Fisher model of evolutionary dynamics. We show that common measures of epistasis, such as the number of monotonically increasing paths between ancestral and derived genotypes, the prevalence of sign epistasis, and the number of local fitness maxima, are distorted in the inferred landscape. As a result, the inferred landscape will provide systematically biased predictions for the dynamics of adaptation. We identify the same biases in a computational RNA-folding landscape as well as regulatory sequence binding data treated with the same fitting procedure. Finally, we present a method to ameliorate these biases in some cases.
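The regression approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes biallelic loci coded 0/1, ordinary least squares as the fitting procedure, and a made-up toy landscape with one three-locus interaction. All function names (`design_matrix`, `fit_quadratic_landscape`, `predict`, `count_local_maxima`) are hypothetical and chosen only for this illustration. It fits additive and pairwise effects to a random sample of genotypes, then compares one measure of epistasis mentioned above, the number of local fitness maxima, between the true and the inferred landscape.

```python
import itertools
import numpy as np

def design_matrix(genotypes):
    """Columns: intercept, one per locus (additive), one per pair of loci."""
    g = np.asarray(genotypes, dtype=float)
    n_loci = g.shape[1]
    pairs = itertools.combinations(range(n_loci), 2)
    pair_cols = [g[:, i] * g[:, j] for i, j in pairs]
    return np.column_stack([np.ones(g.shape[0]), g] + pair_cols)

def fit_quadratic_landscape(genotypes, fitnesses):
    """Least-squares estimate of intercept, additive and pairwise effects."""
    X = design_matrix(genotypes)
    y = np.asarray(fitnesses, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict(coef, genotypes):
    """Fitness predicted by the fitted additive-plus-pairwise model."""
    return design_matrix(genotypes) @ coef

def count_local_maxima(fitness, n_loci):
    """Count genotypes fitter than all single-mutant neighbors.

    `fitness` is indexed by genotype, where bit k of the index is locus k.
    """
    count = 0
    for idx in range(2 ** n_loci):
        neighbors = [idx ^ (1 << k) for k in range(n_loci)]
        if all(fitness[idx] > fitness[n] for n in neighbors):
            count += 1
    return count

# Toy landscape (an assumption for illustration only): random additive
# effects plus one three-locus interaction that the model cannot represent.
rng = np.random.default_rng(0)
n_loci = 6
all_genotypes = np.array(
    [[(idx >> k) & 1 for k in range(n_loci)] for idx in range(2 ** n_loci)]
)
additive = rng.normal(size=n_loci)
three_way = all_genotypes[:, 0] * all_genotypes[:, 1] * all_genotypes[:, 2]
true_fitness = all_genotypes @ additive + 2.0 * three_way

# Fit on a random sample of genotypes, then predict the full landscape.
sample = rng.choice(2 ** n_loci, size=40, replace=False)
coef = fit_quadratic_landscape(all_genotypes[sample], true_fitness[sample])
inferred_fitness = predict(coef, all_genotypes)

print("local maxima, true landscape:    ", count_local_maxima(true_fitness, n_loci))
print("local maxima, inferred landscape:", count_local_maxima(inferred_fitness, n_loci))
```

Because the toy landscape contains an interaction of higher order than the model allows, the fitted coefficients absorb part of it. This is only a loose illustration of model misspecification and does not reproduce the specific biases analyzed in the paper.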

Similar resources

Epistasis and the Structure of Fitness Landscapes: Are Experimental Fitness Landscapes Compatible with Fisher’s Geometric Model?

The fitness landscape defines the relationship between genotypes and fitness in a given environment and underlies fundamental quantities such as the distribution of selection coefficient and the magnitude and type of epistasis. A better understanding of variation in landscape structure across species and environments is thus necessary to understand and predict how populations will adapt. An inc...

The diversity of evolutionary dynamics on epistatic versus non-epistatic fitness landscapes

Although the role of epistasis in evolution has received considerable attention from experimentalists and theorists alike, it is unknown which aspects of adaptation are in fact sensitive to epistasis. Here, we address this question by comparing the evolutionary dynamics on all finite epistatic landscapes versus all finite non-epistatic landscapes, under weak mutation. We first analyze the fitne...

Measuring epistasis in fitness landscapes: The correlation of fitness effects of mutations.

Genotypic fitness landscapes are constructed by assessing the fitness of all possible combinations of a given number of mutations. In the last years, several experimental fitness landscapes have been completely resolved. As fitness landscapes are high-dimensional, simple measures of their structure are used as statistics in empirical applications. Epistasis is one of the most relevant features ...

The rank ordering of genotypic fitness values predicts genetic constraint on natural selection on landscapes lacking sign epistasis.

Sewall Wright's genotypic fitness landscape makes explicit one mechanism by which epistasis for fitness can constrain evolution by natural selection. Wright distinguished between landscapes possessing multiple fitness peaks and those with only a single peak and emphasized that the former class imposes substantially greater constraint on natural selection. Here I present novel formalism that mor...

A framework for inferring fitness landscapes of patient-derived viruses using quasispecies theory A Letter Submitted to Genetics

Fitness is a central quantity in evolutionary models of viruses. However, it remains difficult to determine viral fitness experimentally, and existing in vitro assays can be poor predictors of in vivo fitness of viral populations within their hosts. Next-generation sequencing can nowadays provide snapshots of evolving virus populations, and these data offer new opportunities for inferring v...

Journal title:
  • Proceedings of the National Academy of Sciences of the United States of America

Volume 111, Issue 22

Pages  -

Publication date: 2014